Use of Semantic Knowledge Base for Enhancement of Coherence of Code-mixed Topic-Based Aspect Clusters

Authors

  • Kavita Asnani
  • Jyoti D. Pawar
Abstract

Code-mixing is becoming increasingly popular on social media, generating an enormous volume of noisy, sparse multilingual text in which the useful topics people discuss are highly dispersed. Moreover, the semantics are expressed across randomly occurring code-mixed words. In this paper, we propose code-mixed knowledge-based LDA (cmkLDA), which infers latent topic-based aspects from code-mixed social media data. We experimented on FIRE 2014, a code-mixed corpus, and showed that, with the help of semantic knowledge from a multilingual external knowledge base, cmkLDA learns coherent topic-based aspects across languages and improves topic interpretability and topic distinctiveness over the baseline models. The same is shown to have agreed with human judg-

Publication date: 2016